Entropy of finite random binary sequences with weak long-range correlations

نویسندگان

  • S. S. Melnik
  • O. V. Usatenko
چکیده

We study the N-step binary stationary ergodic Markov chain and analyze its differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain through the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses the two-point correlators instead of the block probability, it makes it possible to calculate the entropy of strings at much longer distances than using standard methods. A fluctuation contribution to the entropy due to finiteness of random chains is examined. This contribution can be of the same order as its regular part even at the relatively short lengths of subsequences. A self-similar structure of entropy with respect to the decimation transformations is revealed for some specific forms of the pair correlation function. Application of the theory to the DNA sequence of the R3 chromosome of Drosophila melanogaster is presented.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Entropy and long-range correlations in random symbolic sequences

The goal of this paper is to develop an estimate for the entropy of random long-range correlated symbolic sequences with elements belonging to a finite alphabet. As a plausible model, we use the high-order additive stationary ergodic Markov chain. Supposing that the correlations between random elements of the chain are weak we express the differential entropy of the sequence by means of the sym...

متن کامل

Entropy and long-range correlations in DNA sequences

We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak we expre...

متن کامل

Repeat Sequences and Base Correlations in Human Y Chromosome Palindromes

On the basis of information theory and statistical methods, we use mutual information, ntuple entropy and conditional entropy, combined with biological characteristics, to analyze the long range correlation and short range correlation in human Y chromosome palindromes. The magnitude distribution of the long range correlation which can be reflected by the mutual information is P5>P5a>P5b (P5a an...

متن کامل

Phase-Transition in Binary Sequences with Long-Range Correlations

Motivated by novel results in the theory of correlated sequences, we analyze the dynamics of random walks with long-term memory (binary chains with long-range correlations). In our model, the probability for a unit bit in a binary string depends on the fraction of unities preceding it. We show that the system undergoes a dynamical phase-transition from normal diffusion, in which the variance DL...

متن کامل

On the non-randomness of maximum Lempel Ziv complexity sequences of finite size

Random sequences attain the highest entropy rate. The estimation of entropy rate for an ergodic source can be done using the Lempel Ziv complexity measure yet, the exact entropy rate value is only reached in the infinite limit. We prove that typical random sequences of finite length fall short of the maximum Lempel-Ziv complexity, contrary to common belief. We discuss that, for a finite length,...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Physical review. E, Statistical, nonlinear, and soft matter physics

دوره 90 5-1  شماره 

صفحات  -

تاریخ انتشار 2014